Abstract. Rectangular layouts, subdivisions of an outer rectangle into smaller rectangles, have many
applications in visualizing spatial information, for instance in rectangular cartograms in which the 
rectangles represent geographic or political regions. A spatial treemap is a rectangular layout with a 
hierarchical structure: the outer rectangle is subdivided into rectangles that are in turn subdivided into 
smaller rectangles. We describe algorithms for transforming a rectangular layout that does not have this 
hierarchical structure, together with a clustering of the rectangles of the layout, into a spatial treemap 
that respects the clustering and also respects to the extent possible the adjacencies of the input layout. 



1 Introduction 

Spatial treemaps are an effective technique to visualize two-dimensional hierarchical information. They 
display hierarchical data by using nested rectangles in a space-filling layout. Each rectangle represents a 
geometric or geographic region, which in turn can be subdivided recursively into smaller regions. On lower 
levels of the recursion, rectangles can also be subdivided based on non-spatial attributes. Typically, at the 
lowest level some attribute of interest of the region is summarized by using properties like area or color. 
Treemaps were originally proposed to represent one-dimensional information in two dimensions [14]. However,
they are well suited to represent spatial (two-dimensional) data because the containment metaphor of
the nested rectangles has a natural geographic meaning, and two-dimensional data makes an efficient use of
space [18].

Spatial treemaps are closely related to rectangular cartograms [13]: distorted maps where each region is
represented by a rectangle whose area corresponds to a numerical attribute such as population. Rectangular
cartograms can be seen as spatial treemaps with only one level; multi-level spatial treemaps in which every
rectangle corresponds to a region are also known as rectangular hierarchical cartograms [15, 16]. Spatial
treemaps and rectangular cartograms have in common that it is essential to preserve the recognizability of 
the regions shown pT) . Most previous work on spatial treemaps reflects this by focusing on the preservation 
of distances between the rectangular regions and their geographic counterparts (that is, they minimize the 
displacement of the regions). However, small displacement often does not imply recognizability (swapping
the position of two small neighboring countries can result in small displacement, but a big loss of
recognizability). In the case of cartograms, most emphasis has been put on preserving adjacencies between the
geographic regions. It has also been shown that while preserving the topology it is possible to keep the
displacement error small [3, 17].

In this paper we are interested in constructing high-quality spatial treemaps by prioritizing the preservation 
of topology, following a principle already used for rectangular cartograms. Previous work on treemaps 
has recognized that preserving neighborhood relationships and relative positions between the regions are
important criteria [7, 11, 18], but we are not aware of treemap algorithms that put the emphasis on preserving
topology.

The importance of preserving adjacencies in spatial treemaps can be appreciated by viewing a concrete
example. Figure 1, from [15], shows a spatial treemap of property transactions in London between 2000 and
2008, with two levels formed by the boroughs and wards of London and colors representing average prices.
To see whether housing prices of neighboring wards are correlated, it is important to preserve adjacencies:
otherwise it is easy to draw incorrect conclusions, like seeing clusters that do not actually exist, or missing
existing ones.

Fig. 1. A 2-level spatial treemap from [15]; used with permission.

Fig. 2. (a) An example input: a full layout of the bottom level, but the regions at a higher level in the hierarchy
are not rectangles. (b) The desired output: another layout, in which as many lower-level adjacencies as possible
have been kept while reshaping the regions at a higher level into rectangles.

Preserving topology in spatial treemaps poses different challenges than in (non-hierarchical) rectangular 
cartograms. Topology-preserving rectangular cartograms exist under very mild conditions and can be
constructed efficiently [3, 17]. As we show in this paper, this is not the case when a hierarchy is added to the
picture. 

In this paper we consider the following setting: the input is a hierarchical rectangular subdivision with two 
levels. We consider only two levels due to the complexity of the general m-level case. However, the two-level 
case is interesting on its own, and applications that use only two-level data have recently appeared [15].

Furthermore, we adopt a 2-phase approach for building spatial treemaps. In the first phase, a base rectangular 
cartogram is produced from the original geographic regions. This can be done with one of the many 
algorithms for rectangular cartograms |^ |. The result will contain all the bottom-level regions as rectangles, 
but the top-level regions will not be rectangular yet, thus will not represent the hierarchical structure. In the 
second phase, we convert the base cartogram into a treemap by making the top-level regions rectangles. It is 
at this stage that we intend to preserve the topology of the base cartogram as much as possible, and where 
our algorithms come in. See Figure 2 for an example.

The advantage of this 2-phase approach is that it allows for customization and user interaction. Interactive 
exploration of the data is essential when visualizing large amounts of data. The freedom to use an arbitrary 
rectangular layout algorithm in the first phase of the construction allows the user to prioritize the adjacencies 
that he or she considers most essential. In the second phase, our algorithm will produce a treemap that will
try to preserve as many of the adjacencies in the base cartogram as possible.

In addition, we go one step further and consider preserving the orientations of the adjacencies in the base
cartogram (that is, whether two neighboring regions share a vertical or horizontal edge, and which one is
on which side). This additional constraint is justified by the fact that the regions represent geographic or
political regions, and relative positions between regions are an important factor when visualizing this type
of data [17]. The preservation of orientations has been studied for cartograms, but to our knowledge,
this is the first time it is considered for spatial treemaps.

We can distinguish three types of adjacency-relations: (i) top-level adjacencies, (ii) internal bottom-level 
adjacencies (adjacencies between two rectangles that belong to the same top-level region), and (iii) external 
bottom-level adjacencies (adjacencies between two rectangles that belong to different top-level regions). 
As we argue in the next section, we can always preserve all adjacencies of types (i) and (ii) under a mild 
assumption, hence the objective of our algorithms is to construct treemaps that preserve as many adjacencies 
of type (iii) as possible. We consider several variants of the problem, based on whether the orientations 
of the adjacencies have to be preserved, and whether the top-level layout is given in advance. In order to 
give efficient algorithms, we restrict ourselves to top-level regions that are orthogonally convex. This is a 
technical limitation that seems difficult to overcome, but that we expect does not limit the applicability of 
our results too much: our algorithms should still be useful for many practical instances, for example, by 
subdividing non-convex regions into few convex pieces. 

Results. In the most constrained case in which adjacencies and their orientations need to be preserved and
the top-level layout is given, we solve the problem in $O(n)$ time, where $n$ is the total number of rectangles.
The case in which the global layout is not fixed is much more challenging: it takes a combination of several
techniques based on regular edge labelings to obtain an algorithm that solves the problem optimally in
$O(k^4 \log k + n)$ time, for $k$ the number of top-level regions; we expect $k$ to be much smaller than $n$. Finally,
we prove that the case in which the orientations of adjacencies do not need to be preserved is NP-hard; we
give worst-case bounds and an approximation algorithm.



2 Preliminaries 

Rectangles and Subdivisions. All geometric objects like rectangles and polygons in this paper are defined
as rectilinear (axis-aligned) objects in the Euclidean plane $\mathbb{R}^2$. A
rectangle in this text is always an axis-aligned rectangle. We also define polygon, convex, etc. in a rectilinear
sense. A set of rectangles $\mathcal{R}$ is called a rectangle complex if the interiors of none of the rectangles overlap,
and each pair of rectangles is either completely disjoint or shares part of an edge; no two rectangles may
meet in a single point. Each rectangle of a rectangle complex is a cell of that complex. We represent rectangle
complexes using a structure that has bidirectional pointers between neighboring cells.
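
The definition of a rectangle complex can be illustrated with a small Python sketch. It is not the pointer-based structure mentioned above: the names `Rect`, `side_of`, and `is_rectangle_complex` are hypothetical, rectangles are assumed to have positive width and height, and the check is a simple quadratic scan.

```python
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    """Axis-aligned rectangle (hypothetical helper type)."""
    x0: float  # left
    y0: float  # bottom
    x1: float  # right
    y1: float  # top


def _overlaps(a: Rect, b: Rect):
    """Signed overlap lengths of the x- and y-projections of a and b."""
    return (min(a.x1, b.x1) - max(a.x0, b.x0),
            min(a.y1, b.y1) - max(a.y0, b.y0))


def side_of(a: Rect, b: Rect) -> Optional[str]:
    """Side of `a` on which `b` is adjacent, or None if they share no edge piece."""
    dx, dy = _overlaps(a, b)
    if dx == 0 and dy > 0:          # shared vertical edge piece
        return "right" if a.x1 == b.x0 else "left"
    if dy == 0 and dx > 0:          # shared horizontal edge piece
        return "top" if a.y1 == b.y0 else "bottom"
    return None


def is_rectangle_complex(rects: Sequence[Rect]) -> bool:
    """Check the two conditions of the definition for every pair of cells."""
    for i, a in enumerate(rects):
        for b in rects[i + 1:]:
            dx, dy = _overlaps(a, b)
            if dx > 0 and dy > 0:    # interiors overlap
                return False
            if dx == 0 and dy == 0:  # the two cells meet in a single point
                return False
    return True
```

For instance, `side_of(Rect(0, 0, 1, 1), Rect(1, 0, 2, 2))` reports that the second rectangle lies on the right side of the first, which is the oriented adjacency used throughout the paper.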

Let $\mathcal{R}$ be a rectangle complex. The boundary of $\mathcal{R}$ is the boundary of the union of the rectangles in $\mathcal{R}$.
Note that this is always a proper polygon, but it could have multiple components and holes. We say that $\mathcal{R}$ is
simple if its boundary is a simple polygon, i.e., it is connected and has no holes. We say that $\mathcal{R}$ is convex if
its boundary is orthogonally convex, i.e., the intersection of any horizontal or vertical line with $\mathcal{R}$ is either
empty or a single line segment. We say that $\mathcal{R}$ is rectangular if its boundary is a rectangle.
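
As an illustration of the convexity definition, the following Python sketch tests orthogonal convexity by brute force. It assumes cells are given as tuples `(x0, y0, x1, y1)`; the function names are hypothetical, and the approach (sampling every cell coordinate and every midpoint between consecutive coordinates) is chosen for brevity, not efficiency.

```python
def _is_single_segment(intervals):
    """True if the union of the closed intervals is empty or one segment."""
    reach = None
    for lo, hi in sorted(intervals):
        if reach is not None and lo > reach:
            return False  # gap between two pieces of the intersection
        reach = hi if reach is None else max(reach, hi)
    return True


def is_orthogonally_convex(cells):
    """cells: list of (x0, y0, x1, y1) tuples forming a rectangle complex."""
    # First pass: horizontal lines (fixed y, collect x-intervals).
    # Second pass: vertical lines (fixed x, collect y-intervals).
    for lo_i, hi_i, a_i, b_i in ((1, 3, 0, 2), (0, 2, 1, 3)):
        cuts = sorted({c[lo_i] for c in cells} | {c[hi_i] for c in cells})
        samples = cuts + [(p + q) / 2 for p, q in zip(cuts, cuts[1:])]
        for s in samples:
            hits = [(c[a_i], c[b_i]) for c in cells if c[lo_i] <= s <= c[hi_i]]
            if not _is_single_segment(hits):
                return False
    return True
```

An L-shaped complex passes this test, while a U-shaped one fails because a horizontal line through the two prongs meets the complex in two segments.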

Let $\mathcal{R}'$ be another rectangle complex. We say that $\mathcal{R}'$ is an extension of $\mathcal{R}$ if there is a bijective mapping
between the cells in $\mathcal{R}$ and $\mathcal{R}'$ that preserves the adjacencies and their orientations. Note that $\mathcal{R}'$ could have
adjacencies not present in $\mathcal{R}$ though. We say that $\mathcal{R}'$ is a simple extension of $\mathcal{R}$ if $\mathcal{R}$ is not simple but $\mathcal{R}'$ is;
similarly we may call it a convex extension or a rectangular extension.

We show that every rectangle complex has a rectangular extension. 

Lemma 1. Let $\mathcal{R}$ be a rectangle complex. There always exists a rectangular extension of $\mathcal{R}$.

Proof. We first augment $\mathcal{R}$ by four rectangles forming a bounding box of $\mathcal{R}$. Our goal is to extend the
complex so that no holes inside the bounding box remain, while all existing adjacencies are preserved with
their orientation. Obviously each hole is formed by at least four adjacent rectangles. Let $H$ be a hole of
the augmented complex. If there is a rectangle $R$ adjacent to $H$ with a full rectangle edge, we can extend $R$
into the hole until it touches another rectangle. This either closes the hole or splits it into holes of lower
complexity. Now assume that there is a hole without a rectangle adjacent to it along a full edge. Then
each edge of the hole is a partial rectangle edge blocked by another rectangle. This is only possible in a
"windmill" configuration of four rectangles cyclically blocking each other. But such a hole can be removed
by moving two opposite rectangles toward each other while shrinking the other two rectangles. See Figure 3.
None of the operations removes adjacencies or changes their orientations. □

We define $D = \{\text{left}, \text{right}, \text{top}, \text{bottom}\}$ to be the set of the four cardinal directions. For a direction $d \in D$
we use the notation $-d$ to refer to the direction opposite from $d$. We define an object $O \subset \mathbb{R}^2$ to be extreme
in direction $d$ with respect to a rectangle complex $\mathcal{R}$ if there is a point in $O$ that is at least as far in direction
$d$ as any point in $\mathcal{R}$. Let $R \in \mathcal{R}$ be a cell, and $d \in D$ a direction. We say $R$ is $d$-extensible if there exists a
rectangular extension $\mathcal{R}'$ of $\mathcal{R}$ in which $R$ is extreme in direction $d$ with respect to $\mathcal{R}'$ (or in other words, if
its $d$-side is part of the boundary of $\mathcal{R}'$).

A set of simple rectangle complexes $\mathcal{L}$ is called a (rectilinear) layout if the boundary of the union of all
complexes is a rectangle, the interiors of the complexes are disjoint, and no point in $\mathcal{L}$ belongs to more than
three cells. If all complexes are rectangular we say that $\mathcal{L}$ is a rectangular layout. We call the rectangle
bounding $\mathcal{L}$ the root box.

Let $\mathcal{L}$ be a rectilinear layout. We define the global layout $\mathcal{L}'$ of $\mathcal{L}$ as the subdivision of the root box of $\mathcal{L}$, in
which the (global) regions are defined by the boundaries of the complexes in $\mathcal{L}$. We say $\mathcal{L}$ is rectangular if
all regions in $\mathcal{L}'$ are rectangles.

Dual Graphs of Rectangle Complexes. The dual graph of a rectangular complex is an embedded planar
graph with one vertex for every rectangle in the complex, and an edge between two vertices if the corresponding
rectangles touch (have overlapping edge pieces). The extended dual graph of a rectangular complex with
a rectangular boundary has four additional vertices for the four sides of the rectangle, and an edge between
a normal vertex and an additional vertex if the corresponding rectangle touches the corresponding side of
the bounding box. We will be using dual graphs of the whole rectangular layout, of individual complexes,
and of the global layout (ignoring the bottom level subdivision); Figure 4 shows some examples. Extended
dual graphs of rectangular rectangle complexes are fully triangulated (except for the outer face which is a
quadrilateral), and the graphs that can arise in this way are characterized by the following lemma [8, 10, 17]:

Lemma 2. A triangulated plane graph $G$ with a quadrilateral outer face is the dual graph of a rectangular
rectangle complex if and only if $G$ has no separating triangles.

Fig. 5. Not all external adjacencies can be kept.

Now, consider the three types of adjacencies we wish to preserve: 1) (top-level) adjacencies between global 
regions, 2) internal (bottom-level) adjacencies between the cells in one rectangle complex, and 3) external 
(bottom-level) adjacencies between cells of adjacent rectangle complexes. 

Observation 1. It is always possible to keep all internal bottom-level adjacencies. 

Observation 2. It is possible to keep all top-level adjacencies if and only if the extended dual graph of the 
global input layout has no separating triangles. 

Observation 1 follows by applying Lemma 1 to all regions, and Observation 2 follows from Lemma 2, since
the extended dual graph of the global regions is fully triangulated.
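
The condition in Lemma 2 and Observation 2 can be checked directly on small instances. The sketch below is a brute-force illustration, not an algorithm from the paper: it assumes the graph is given as a dictionary mapping each vertex to the set of its neighbors (a hypothetical representation), and it reports a triangle as separating when deleting its three vertices disconnects the remaining graph.

```python
from itertools import combinations


def _connected(adj, allowed):
    """Depth-first search restricted to the vertex set `allowed`."""
    start = next(iter(allowed))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y in allowed and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == allowed


def has_separating_triangle(adj):
    """adj: dict mapping each vertex to the set of its neighbours."""
    vertices = list(adj)
    for u, v, w in combinations(vertices, 3):
        if v in adj[u] and w in adj[u] and w in adj[v]:   # u, v, w form a triangle
            rest = set(vertices) - {u, v, w}
            if rest and not _connected(adj, rest):
                return True
    return False
```

This takes cubic time in the number of vertices, which is acceptable for the global layout when the number of top-level regions is small.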

From now on we assume that the dual graph of the global regions has no separating triangles, and we will 
preserve all adjacencies of types 1 and 2. Unfortunately, it is not always possible to keep adjacencies of type
3 (see Figure 5), and for every adjacency of type 3 that we fail to preserve, another adjacency that was not
present in the original layout will appear. Therefore, our aim is to preserve as many of these adjacencies as 
possible. 



3 Preserving orientations 

We begin by studying the version of the problem where all internal adjacencies have to be preserved respecting
their original orientations. Additionally, we want to maximize the number of preserved and correctly oriented
(bottom-level) external adjacencies. We consider two scenarios: first we assume that the global layout is part
of the input, and then we study the case in which we optimize over all global layouts. The former situation
is particularly interesting for GIS applications, in which the user specifies a certain global layout that needs
to be filled with the bottom-level cells. If, however, the bottom-level adjacencies are more important, then
optimizing over global layouts allows us to preserve more external adjacencies.

3.1 Given the global layout 

In this section we are given, in addition to the initial two-level subdivision $\mathcal{L}$, a global target layout. The
goal is to find a two-level treemap that preserves all oriented bottom-level internal adjacencies and that
maximizes the number of preserved oriented bottom-level external adjacencies in the output.

First observe that in the rectangular output layout any two neighboring global regions have a single
orientation for their adjacency. Hence we can only keep those bottom-level external adjacencies that have
the same orientation in the input as their corresponding global regions have in the output layout. Secondly,
consider a rectangle $R$ in a complex $\mathcal{R}$, and a rectangle $B$ in another complex $\mathcal{B}$. Observe that if $R$ and $B$ are
adjacent in the input, for example with $R$ to the left of $B$, then their adjacency can be preserved only if $R$ is
right-extensible in $\mathcal{R}$ and $B$ is left-extensible in $\mathcal{B}$.

Fig. 6. (a) A region in the input. (b) The same region in the given global layout. (c) Edges of rectangles that
want to become part of a boundary have been marked with arrows. Note that one rectangle wants to become
part of the top boundary but can't, because it is not extensible in that direction. (d) All arrows that aren't
blocked can be made happy.



The main result in this section is that the previous two conditions are enough to describe all adjacencies 
that cannot be preserved, whereas all the other ones can be kept. Furthermore, we will show how to 
decide extensibility for convex complexes, and how to construct a final solution that preserves all possible 
adjacencies, leading to an algorithm for the optimal solution. 

Recall that we assume all regions are orthogonally convex. Consider each rectangle complex of $\mathcal{L}$ separately.
Since we know the required global layout and since all cells externally adjacent to our region are consecutive
along its boundary, we can immediately determine the cells on each of the four sides of the output region
(see Figure 6). The reason is that for a rectangle $R$ that is exterior to its region $\mathcal{R}$, and that is adjacent to
another rectangle $B \in \mathcal{B}$, their adjacency is relevant only if $\mathcal{R}$ and $\mathcal{B}$ are adjacent with the same orientation
in the global layout. We can easily categorize the extensible rectangles of a convex rectangle complex.

Lemma 3. Let $\mathcal{R}$ be a convex rectangle complex, let $R \in \mathcal{R}$ be a rectangle, and $d \in D$ a direction. $R$ is
$d$-extensible if and only if there is no rectangle $R' \in \mathcal{R}$ directly adjacent to $R$ on the $d$-side of $R$.

Proof. For the 'only if' part, simply note that if there is such a rectangle $R' \in \mathcal{R}$, then the adjacency between
$R$ and $R'$ must be preserved, with its original orientation. Hence there is always a point in $R'$ that is further in
direction $d$ than any point in $R$. So $R$ is not $d$-extensible.

For the 'if' part, consider the complex obtained by extending $R$ in direction $d$ until it becomes extreme in
that direction. This is always possible because $\mathcal{R}$ is convex; the resulting complex is still simple. Now we
add a temporary bounding box consisting of four rectangles around $\mathcal{R}$, one in each direction, such that $R$ is
adjacent to the one on the $d$-side. Then we can apply Lemma 1 and find a rectangular extension of $\mathcal{R}$ where
$R$ is extreme on the $d$-side. □
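
Lemma 3 turns extensibility into a purely local test, which can be sketched as follows. The code assumes that the complex is convex (as the lemma requires) and that cells are tuples `(x0, y0, x1, y1)`; the function names are hypothetical.

```python
def touches_on_side(r, s, d):
    """True if rectangle s shares an edge piece with the d-side of rectangle r."""
    rx0, ry0, rx1, ry1 = r
    sx0, sy0, sx1, sy1 = s
    x_overlap = min(rx1, sx1) - max(rx0, sx0)
    y_overlap = min(ry1, sy1) - max(ry0, sy0)
    if d == "right":
        return rx1 == sx0 and y_overlap > 0
    if d == "left":
        return rx0 == sx1 and y_overlap > 0
    if d == "top":
        return ry1 == sy0 and x_overlap > 0
    if d == "bottom":
        return ry0 == sy1 and x_overlap > 0
    raise ValueError(f"unknown direction: {d}")


def is_extensible(r, cells, d):
    """Lemma 3: r is d-extensible iff no other cell of its (convex) complex
    is directly adjacent to r on its d-side."""
    return not any(touches_on_side(r, s, d) for s in cells if s != r)
```

In an L-shaped complex with a wide bottom cell and a narrow cell on top of its left half, the bottom cell is not top-extensible even though part of its top edge lies on the boundary, because its adjacency with the upper cell must keep its orientation.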

Unfortunately, though, we cannot extend all extensible rectangles at the same time. However, we show that
we can actually extend all those rectangles that we want to extend for an optimal solution.

We call a rectangle of a certain complex belonging to a global region engaged if it wants to be adjacent to a
rectangle of another global region, and the direction of their desired adjacency is the same as the direction
of the adjacency between these two regions in the global layout. We say it is $d$-engaged if this direction is
$d \in D$.

Therefore, the rectangles that we want to extend are exactly those that are $d$-extensible and $d$-engaged, since
they are the only ones that help preserve bottom-level exterior adjacencies. It turns out that extending all
these rectangles is possible, because the engaged rectangles of $\mathcal{R}$ have a special property:

Lemma 4. If we walk around the boundary of a region $\mathcal{R}$, we encounter all $d$-engaged rectangles
consecutively.

Proof. Suppose that when walking clockwise along the boundary of $\mathcal{R}$ we encounter rectangles $R_1, R_2, R_3$
that are $d$-, $d'$-, and $d$-engaged, respectively. Since $R_1$ and $R_3$ are both $d$-engaged, in the global layout they
have the same direction of external adjacency. However, if $d' \neq d$, then $R_2$ has a different direction, implying
that the directions alternate in the same way in the global layout. This contradicts the fact that in the global layout $\mathcal{R}$ is a
rectangle, so the rectangles engaged in the four different directions appear contiguously. □
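
The property stated in Lemma 4 is a contiguity condition on a cyclic sequence, which the following sketch checks. It assumes the input is the list of engagement directions met during one clockwise walk along the boundary, with `None` for rectangles that are not engaged (an encoding chosen here for illustration; `appear_consecutively` is a hypothetical name).

```python
def appear_consecutively(labels):
    """True if every direction forms one contiguous block in the cyclic walk."""
    seq = [d for d in labels if d is not None]  # skip non-engaged rectangles
    # Collapse runs of equal labels into a single entry.
    runs = [d for i, d in enumerate(seq) if i == 0 or d != seq[i - 1]]
    # The walk is cyclic, so a run at the end may continue the run at the start.
    if len(runs) > 1 and runs[0] == runs[-1]:
        runs.pop()
    # Each direction may now occur at most once.
    return len(runs) == len(set(runs))
```

A walk that meets top, right, top, right in that order is rejected, matching the contradiction derived in the proof.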

This property of $d$-engaged rectangles is useful due to the following fact.

Lemma 5. Let $\mathcal{R}$ be a convex rectangle complex composed of $r$ rectangles, and let $S$ be a subset of the
extensible and engaged rectangles in $\mathcal{R}$ with the property that if we order them according to a clockwise
walk along the boundary of $\mathcal{R}$, all $d$-extensible rectangles in $S$ are encountered consecutively for each $d \in D$
and in the correct clockwise order. We can compute, in $O(r)$ time, a rectangular extension $\mathcal{R}'$ of $\mathcal{R}$ in which
all $d$-extensible rectangles in $S$ are extreme in direction $d$, for all $d \in D$.

Proof. We use the same idea as in the proof of Lemma 3, but now we extend all rectangles in $S$ at the same
time. Since by Lemma 4 $d$-engaged rectangles appear consecutively around the boundary of $\mathcal{R}$, there cannot
be any conflicts preventing the extension of $d$-engaged rectangles: rectangles with the same direction extend
all toward the same side, thus they can all be made extreme. On the other hand, two rectangles extended
toward different directions cannot influence each other because that would imply that the directions do not
appear contiguously or in the wrong order.

It remains to apply Lemma 1 and show that the rectangular extension can be found in linear time. Since
$\mathcal{R}$ with the rectangles in $S$ extended is still a simple polygon, all holes of the complex after augmenting it
by the four external rectangles are adjacent to the external rectangles. Hence there are no windmill holes.
We can then start walking clockwise along the boundary of $\mathcal{R}$ at the first $d$-extended rectangle and extend
all $d$-extensible rectangles until we reach the first $(d+1)$-extended rectangle, where $d+1$ denotes the next
direction in clockwise order. None of these rectangles is
blocked in direction $d$. We close the corner between the $d$-side and the $(d+1)$-side by extending either the last
$d$-extended rectangle in direction $d+1$ or vice versa, which is always possible since they cannot both block
each other. We continue this process along all four sides of $\mathcal{R}$. Let $H$ be a remaining hole on the $d$-side. It
has the property that none of its adjacent rectangles is $d$-extensible and that it is bounded by two staircases.
We can then close the hole in linear time by simultaneously walking along the two staircases and maximally
extending rectangles orthogonally to direction $d$. □

Therefore, the engaged and extensible rectangles form a subset of rectangles for which Lemma 5 holds, thus
by using the lemma we can find a rectangular extension where all extensible and engaged rectangles are
extreme in the appropriate direction.

Then we can apply this idea to each region. Now we still have to match up the adjacencies in an optimal way,
that is, preserving as many adjacencies from the input as possible. This can be done by matching horizontal
and vertical adjacencies independently. It is always possible to get all the external bottom-level adjacencies
that need to be preserved. This can be seen as follows (see also Figure 7). We process first all horizontal
adjacencies. Consider a complete stretch of horizontal boundary in the global layout. Then the position and
length of the boundary of each region adjacent to that boundary are fixed, from the global layout. The only
freedom left is in the $x$-coordinates of the vertical edges of the rectangles that form part of that boundary
(except for the leftmost and rightmost borders of each region, which are also fixed). Since the adjacencies
that want to be preserved are part of the input, it is always possible to set the $x$-coordinates in order to fulfill
them all. The same can be done with all horizontal boundaries. The vertical boundaries are independent,
thus can be processed in exactly the same way. This yields the main theorem in this subsection.

Theorem 1. Let $T$ be a 2-level treemap, where $n$ is the number of cells in the bottom level, and where
all global regions are orthogonally convex. For a given global target layout $\mathcal{L}$, we can find, in $O(n)$ time,
a rectangular layout of $T$ that respects $\mathcal{L}$, preserves all oriented internal bottom-level adjacencies, and
preserves as many oriented external bottom-level adjacencies as possible.

Fig. 7. (a) After we solved all the different colors separately, we don't necessarily have the right adjacencies yet.
(b) We can indicate their desired adjacencies that are still possible (so, the adjacencies between two edges of
rectangles that actually ended up on the outside) as a graph. Note that this graph is planar. (c) We can poke
around in the insides of the rectangles to make all desired adjacencies happen.

3.2 Unconstrained global layout 

In this section the global target layout of the rectangle complexes is not given, i.e., we are given a rectilinear 
input layout and need to find a rectangular output layout preserving all adjacencies of the rectangle complexes 
and preserving a maximum number of adjacencies of the cells of different complexes. 

We can represent a particular rectangular global layout $\mathcal{L}$ as a regular edge labeling of the dual graph
of the global layout. Let $G(\mathcal{L})$ be the extended dual graph of $\mathcal{L}$. Then $\mathcal{L}$ induces an edge labeling as
follows: an edge corresponding to a joint vertical (horizontal) boundary of two rectangular complexes is
colored blue (red). Furthermore, blue edges are directed from left to right and red edges from bottom to
top. Clearly, the edge labeling obtained from $\mathcal{L}$ in this way satisfies that around each inner vertex $v$ of $G(\mathcal{L})$
the incident edges with the same color and the same direction form contiguous blocks around $v$. The edges
incident to one of the external vertices $\{l, t, r, b\}$ all have the same label. Such an edge labeling is called
regular [9]. Each regular edge labeling of the extended dual graph $G(\mathcal{L})$ defines an equivalence class of
global layouts.
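
To make the induced labeling concrete, the sketch below derives the colors and directions of the inner dual edges from a rectangular layout. It assumes the layout is given as a dictionary from region names to tuples `(x0, y0, x1, y1)`, and it omits the four external vertices; `edge_labeling` is a hypothetical name.

```python
def edge_labeling(rects):
    """Return {(u, v): colour}, where the key (u, v) means the edge is directed u -> v."""
    labels = {}
    for u, (ux0, uy0, ux1, uy1) in rects.items():
        for v, (vx0, vy0, vx1, vy1) in rects.items():
            if u == v:
                continue
            x_overlap = min(ux1, vx1) - max(ux0, vx0)
            y_overlap = min(uy1, vy1) - max(uy0, vy0)
            if ux1 == vx0 and y_overlap > 0:
                labels[(u, v)] = "blue"  # joint vertical boundary, u left of v
            elif uy1 == vy0 and x_overlap > 0:
                labels[(u, v)] = "red"   # joint horizontal boundary, u below v
    return labels
```

For a layout with a tall region A on the left and two stacked regions B (below) and C (above) on the right, this yields blue edges from A to B and from A to C, and a red edge from B to C.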

In order to represent the family of all possible rectangular global layouts we apply a technique described by
Eppstein et al. [5]. Let $\mathcal{L}$ be the rectilinear global input layout and let $G(\mathcal{L})$ be its extended dual graph.
The first step is to decompose $G(\mathcal{L})$ by its separating 4-cycles into minors called separation components
with the property that they do not have non-trivial separating 4-cycles any more, i.e., 4-cycles with more than
a single vertex in the inner part of the cycle. If $C$ is a separating 4-cycle the interior separation component
consists of $C$ and the subgraph induced by the vertices interior to $C$. The outer separation component is
obtained by replacing all vertices in the interior of $C$ by a single vertex connected to each vertex of $C$. This
decomposition can be obtained in linear time [5]. We can then treat each component in the decomposition
independently and finally construct an optimal rectangular global layout from the optimal solutions of its
descendants in the decomposition tree. So let us consider a single component of the decomposition, which by
construction has no non-trivial separating 4-cycles.



Preprocessing of the bottom level. We start with a preprocessing step to compute the number of realizable
external bottom-level adjacencies for pairs of adjacent global regions. This allows us to ignore the
bottom-level cells in later steps and to focus on the global layout and orientations of global adjacencies.

Let $\mathcal{L}$ be a global layout, let $\mathcal{R}$ and $\mathcal{S}$ be two adjacent rectangle complexes in $\mathcal{L}$, and let $d \in D$ be an
orientation. Then we define $\omega(\mathcal{R}, \mathcal{S}, d)$ to be the total number of adjacencies between $d$-engaged and
$d$-extensible rectangles in $\mathcal{R}$ and $(-d)$-engaged and $(-d)$-extensible rectangles in $\mathcal{S}$. By Lemma 5 there is a
rectangular layout of $\mathcal{R}$ and $\mathcal{S}$ with exactly $\omega(\mathcal{R}, \mathcal{S}, d)$ external bottom-level adjacencies between $\mathcal{R}$ and $\mathcal{S}$.
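
The weight $\omega(\mathcal{R}, \mathcal{S}, d)$ can be illustrated with a brute-force count. The sketch assumes both complexes are convex lists of tuples `(x0, y0, x1, y1)`, uses the adjacency test of Lemma 3 for extensibility, and treats an input adjacency oriented in direction `d` as making both cells engaged when the two regions are placed in direction `d`. All names are hypothetical, and the quadratic loop is for clarity only; it is not the linear-time counting of the paper.

```python
OPPOSITE = {"left": "right", "right": "left", "top": "bottom", "bottom": "top"}


def touches_on_side(r, s, d):
    """True if rectangle s shares an edge piece with the d-side of rectangle r."""
    x_overlap = min(r[2], s[2]) - max(r[0], s[0])
    y_overlap = min(r[3], s[3]) - max(r[1], s[1])
    if d == "right":
        return r[2] == s[0] and y_overlap > 0
    if d == "left":
        return r[0] == s[2] and y_overlap > 0
    if d == "top":
        return r[3] == s[1] and x_overlap > 0
    return r[1] == s[3] and x_overlap > 0  # d == "bottom"


def is_extensible(r, cells, d):
    """Lemma 3 test inside one convex complex."""
    return not any(touches_on_side(r, s, d) for s in cells if s != r)


def omega(R, S, d):
    """Number of input adjacencies (r, s), s on the d-side of r, that stay realizable."""
    return sum(
        1
        for r in R
        for s in S
        if touches_on_side(r, s, d)
        and is_extensible(r, R, d)
        and is_extensible(s, S, OPPOSITE[d])
    )
```

Evaluating `omega` for each of the four directions gives, for a pair of regions, the weights that the later optimization over global layouts works with.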

We show the following (perhaps surprising) lemma:

Lemma 6. For any pair $\mathcal{L}$ and $\mathcal{L}'$ of global layouts and any pair $\mathcal{R}$ and $\mathcal{S}$ of rectangular rectangle
complexes whose adjacency direction with respect to $\mathcal{R}$ is $d$ in $\mathcal{L}$ and $d'$ in $\mathcal{L}'$, the number of external
bottom-level adjacencies between $\mathcal{R}$ and $\mathcal{S}$ in any optimal solution for $\mathcal{L}'$ differs by $\omega(\mathcal{R}, \mathcal{S}, d') - \omega(\mathcal{R}, \mathcal{S}, d)$ from
that for $\mathcal{L}$. For adjacent rectangle complexes whose adjacency direction is the same in both global layouts the
number of adjacencies in any optimal solution remains the same.

Proof. The value $\omega(\mathcal{R}, \mathcal{S}, d)$ is the maximum number of external bottom-level adjacencies between $\mathcal{R}$ and
$\mathcal{S}$ that can be realized if $\mathcal{S}$ is adjacent to $\mathcal{R}$ in direction $d$. By Lemma 5 there is a rectangular extension
of the global layout $\mathcal{L}$ in which this number of adjacencies between $\mathcal{R}$ and $\mathcal{S}$ is realized. So clearly the
difference in $\mathcal{R}$-$\mathcal{S}$ adjacencies is $\omega(\mathcal{R}, \mathcal{S}, d') - \omega(\mathcal{R}, \mathcal{S}, d)$. Adjacent pairs of rectangle complexes whose
adjacency direction remains the same are not affected by changes of adjacency directions elsewhere in the
global layout. □

This basically means we can consider changes of adjacency directions locally and independently from the rest
of the layout. Furthermore, since the values $\omega(\mathcal{R}, \mathcal{S}, d)$ are directly obtained from counting the numbers of
$d$-extensible and $d$-engaged rectangles in $\mathcal{R}$ (or $(-d)$-extensible and $(-d)$-engaged rectangles in $\mathcal{S}$) we get the
next lemma.

Lemma 7. We can compute all values $\omega(\mathcal{R}, \mathcal{S}, d)$ in $O(n)$ total time.

Optimizing in a graph without separating 4-cycles. Here we will prove the following:

Theorem 2. Let $G$ be an embedded triangulated planar graph with $k'$ vertices without separating 3-cycles
and without non-trivial separating 4-cycles, except for the outer face which consists of exactly four vertices.
Furthermore, let a weight $\omega(e, d)$ be assigned to every edge $e$ in $G$ and every orientation $d$ in $D$. Then we
can find a rectangular subdivision of which $G$ is the extended dual that maximizes the total weight of the
directed adjacencies in $O(k'^4 \log k')$ time.

In order to optimize over all rectangular subdivisions with the same extended dual graph we make use of
the representation of these subdivisions as elements in a distributive lattice or, equivalently, as closures in
a partial order induced by this lattice [4, 5]. There are two moves or flips by which we can transform one
rectangular layout (or its regular edge labeling) into another one, edge flips and vertex flips (Figure 8). They
form a graph where each equivalence class of rectangular layouts is a vertex and two vertices are connected
by an edge if they are transformable into each other by a single move, with the edge directed toward the
more counterclockwise layout with respect to this move. This graph is acyclic and its reachability ordering
is a distributive lattice [6]. It has a minimal (maximal) element that is obtained by repeatedly performing
clockwise (counterclockwise) moves.

By Birkhoff's representation theorem, each element in this lattice is in one-to-one correspondence to a
partition of a partial order $\mathcal{P}$ into an upward-closed set $U$ and a downward-closed set $L$. The elements in
$\mathcal{P}$ are pairs $(x, i)$, where $x$ is a flippable item, i.e., either the edge of an edge flip or the vertex of a vertex
flip [4, 5]. The integer $i$ is the so-called flipping number $f_x(\mathcal{L})$ of $x$ in a particular layout $\mathcal{L}$, i.e., the
well-defined number of times flip $x$ is performed counterclockwise on any path from the minimal element $\mathcal{L}_{\min}$ to
$\mathcal{L}$ in the distributive lattice. An element $(x, i)$ is smaller than another element $(y, j)$ in this order if $y$ cannot
be flipped for the $j$-th time before $x$ is flipped for the $i$-th time. For each upward- and downward-closed
partition $U$ and $L$, the corresponding layout can be reconstructed by performing all flips in the lower set $L$.
$\mathcal{P}$ has $O(k'^2)$ vertices and edges and can be constructed in $O(k'^2)$ time. The construction starts with an
arbitrary layout, performs a sequence of clockwise moves until we reach $\mathcal{L}_{\min}$, and from there performs a
sequence of counterclockwise moves until we reach the maximal element. During this last process we count
how often each element is flipped, which determines all pairs $(x, i)$ of $\mathcal{P}$. Since each flip $(x, i)$ affects only
those flippable items that belong to the same triangle as $x$, we can initialize a queue of possible flips, and
iteratively extract the next flip and add the new flips to the queue in total time 0{k'^). In order to create the 
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(b) vertex flip E 



Fig. 8. Flip operations 





Fig. 9. (a) A graph with non-trivial separating 4-cycles. Note that some 4-cycles intersect each other, (b) A 
possible decomposition tree of 4-cycle-free graphs (root on the left). 



edges in V we again use the fact that a flip [x, i) depends only on flips (jc', /'), where x' belongs to the same 
triangle as x and /' differs by at most 1 from /. The actual dependencies can be obtained from their states in 
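
The correspondence between lattice elements and downward-closed sets can be illustrated on a tiny example. The following is only a sketch: the poset below is hypothetical (it is not derived from an actual layout), its elements are merely named like flip pairs $(x, i)$, and the enumeration is brute force rather than the queue-based construction described above.

```python
from itertools import combinations

def down_sets(elements, smaller):
    """Enumerate all downward-closed subsets of a finite poset.

    smaller[y] is the set of elements that must precede y; for this toy
    example it is assumed to be transitively closed.
    """
    result = []
    for r in range(len(elements) + 1):
        for subset in combinations(elements, r):
            chosen = frozenset(subset)
            # Downward-closed: every predecessor of a chosen element is chosen.
            if all(smaller.get(y, set()) <= chosen for y in chosen):
                result.append(chosen)
    return result

# Hypothetical flip poset: item "a" can be flipped twice, item "b" once,
# and the flip of "b" requires the first flip of "a".
elements = [("a", 1), ("a", 2), ("b", 1)]
smaller = {
    ("a", 2): {("a", 1)},
    ("b", 1): {("a", 1)},
}

layouts = down_sets(elements, smaller)
```

Each set in `layouts` plays the role of a lower set $L$, i.e., the flips to perform starting from the minimal layout. The family is closed under union and intersection, which is what makes the inclusion order a distributive lattice.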

Next, we assign weights to the nodes in $\mathcal{P}$. Let $\mathcal{L}_{\min}$ be the layout that is minimal in 
the distributive lattice, i.e., the layout where no more clockwise flips are possible. For an edge-flip node 
$(e, i)$ let $\mathcal{R}$ and $\mathcal{S}$ be the two rectangle complexes adjacent across $e$. Then the weight 
$\omega(e,i)$ is obtained as follows. Starting with the adjacency direction between $\mathcal{R}$ and 
$\mathcal{S}$ in $\mathcal{L}_{\min}$ we cycle $i$ times through the set $D$ in counterclockwise fashion. 
Let $d$ be the $i$-th direction and $d'$ the $(i+1)$-th direction. Then 
$\omega(e,i) = \omega(e,d') = \omega(\mathcal{R},\mathcal{S},d') - \omega(\mathcal{R},\mathcal{S},d)$. 
For a vertex-flip node $(v, i)$ let $\mathcal{R}$ be the degree-4 rectangle complex surrounded by the four 
complexes $\mathcal{S}_1, \ldots, \mathcal{S}_4$. We again determine the adjacency directions between 
$\mathcal{R}$ and $\mathcal{S}_1, \ldots, \mathcal{S}_4$ in $\mathcal{L}_{\min}$ and cycle $i$ times through $D$ 
to obtain the $i$-th directions $d_1, \ldots, d_4$ as well as the $(i+1)$-th directions $d'_1, \ldots, d'_4$. Then 
$\omega(v,i) = \sum_{j=1}^{4} \left( \omega(\mathcal{R},\mathcal{S}_j,d'_j) - \omega(\mathcal{R},\mathcal{S}_j,d_j) \right)$. 
Equivalently, if the four edges incident to $v$ are $e_1, \ldots, e_4$, we have 
$\omega(v,i) = \sum_{j=1}^{4} \omega(e_j,d'_j)$. 
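
The node weights can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the concrete counterclockwise order chosen for $D$, the function names, and the sample values of $\omega(\mathcal{R},\mathcal{S},d)$ are all assumptions made for the example.

```python
# Assumed counterclockwise cyclic order of the direction set D; the text
# does not spell out this order, so it is an illustrative choice.
D = ["right", "above", "left", "below"]

def edge_flip_weight(omega, start, i):
    """Weight of the edge-flip node (e, i).

    omega: dict mapping each direction d to omega(R, S, d)
    start: adjacency direction between R and S in the minimal layout
    i:     number of counterclockwise flips already performed
    """
    s = D.index(start)
    d = D[(s + i) % len(D)]            # i-th direction
    d_next = D[(s + i + 1) % len(D)]   # (i+1)-th direction
    return omega[d_next] - omega[d]

def vertex_flip_weight(omegas, starts, i):
    """Weight of the vertex-flip node (v, i): sum over the four incident edges."""
    return sum(edge_flip_weight(o, s, i) for o, s in zip(omegas, starts))

# Made-up values omega(R, S, d) for one pair of complexes.
omega = {"right": 3, "above": 5, "left": 0, "below": 1}
weights = [edge_flip_weight(omega, "right", i) for i in range(3)]
```

Because each weight is a difference of consecutive values, the weights of the first $j$ flips telescope to the difference between the value of the $j$-th direction and the value of the starting direction. This is why a closure's total weight measures the gain relative to the minimal layout.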

Finally, we compute a maximum-weight closure of $\mathcal{P}$ using a max-flow algorithm [1, Chapter 19.2], which 
will take $O(k'^4 \log k')$ time for a graph with $O(k'^2)$ nodes. 
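
The reduction from maximum-weight closure to a minimum cut can be sketched as follows. This is an illustration under stated assumptions: it uses a plain Edmonds-Karp augmenting-path routine for brevity, which does not achieve the running time quoted above, and the function name, the `requires` encoding of the partial order, and the sample weights are hypothetical.

```python
from collections import deque

def max_weight_closure(weights, requires):
    """Return (best_value, chosen) for a maximum-weight closed set.

    weights:  dict node -> weight (may be negative)
    requires: dict node -> nodes that must also be chosen (the smaller
              elements of the partial order)
    """
    INF = float("inf")
    S, T = "_source", "_sink"   # assumed not to clash with node names
    cap = {S: {}, T: {}}
    for v in weights:
        cap.setdefault(v, {})

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)   # residual arc

    for v, w in weights.items():
        if w > 0:
            add_edge(S, v, w)     # gain that is lost if v is not chosen
        elif w < 0:
            add_edge(v, T, -w)    # cost that is paid if v is chosen
    for v, deps in requires.items():
        for u in deps:
            add_edge(v, u, INF)   # choosing v forces choosing u

    def reachable():
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    parent = reachable()
    while T in parent:
        # Find the bottleneck on the augmenting path, then push flow.
        bottleneck, v = INF, T
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = T
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck
        parent = reachable()

    chosen = set(parent) - {S}    # source side of a minimum cut
    total_positive = sum(w for w in weights.values() if w > 0)
    return total_positive - flow, chosen

# Hypothetical flip nodes: the second flip of each item requires the first.
weights = {"a1": -2, "a2": 5, "b1": -4, "b2": 3}
requires = {"a2": ["a1"], "b2": ["b1"]}
value, chosen = max_weight_closure(weights, requires)
```

In this toy instance it pays to perform both flips of `a` (net gain 3) but not those of `b` (net loss 1), so the returned closure is `{"a1", "a2"}`.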



Optimizing in General Graphs. In this section, we show how to remove the restriction that the graph 
should have no separating 4-cycles. We do this by decomposing the graph $G$ by its separating 4-cycles and 
solving the subproblems in a bottom-up fashion. 

Lemma 8 (Eppstein et al. [5]). Given a plane graph $G$ with $k$ vertices, there exists a collection $C$ of 
separating 4-cycles in $G$ that decomposes $G$ into separation components that do not contain separating 
4-cycles any more. Such a collection $C$ and the decomposition can be computed in $O(k)$ time. 

These cycles naturally subdivide $G$ into a tree of subgraphs, which we will denote as $T_G$. Still following [5], 
we add an extra artificial vertex inside each 4-cycle, which corresponds to filling the void in the subdivision 
after removing all rectangles inside by a single rectangle. Figure 9 shows an example of a graph $G$ and a 
corresponding tree $T_G$. 

Now, all nodes of $T_G$ have an associated graph without separating 4-cycles on which we can apply Theorem 2. 
The only thing left to do is assign the correct weights to the edges of these graphs. For a given node $\nu$ of 
$T_G$, let $G_\nu$ be the subgraph of $G$ associated to $\nu$ (with potentially extra vertices inside its 4-cycles). 






For every leaf $\nu$ of $T_G$, we assign weights to the internal edges of $G_\nu$ by simply setting 
$\omega(e, d) = \omega(\mathcal{R}, \mathcal{S}, d)$ if $e$ separates $\mathcal{R}$ and $\mathcal{S}$ in the 
global layout $\mathcal{L}$. For the external edges of $G_\nu$ (the edges that are incident to one of the 
"corner" vertices of the outer face), we fix the orientations in the four possible ways, leading to 
four different problems. We apply Theorem 2 four times, once for each orientation. We store the resulting 
solution values as well as the corresponding optimal layouts at $\nu$ in $T_G$. 

Now, in bottom-up order, for each internal node $\nu$ in $T_G$, we proceed in a similar way with one important 
change: for each child $\mu$ of $\nu$, we first look up the four optimal layouts of $\mu$ and incorporate them in 
the weights of the four edges incident to the single extra vertex that replaced $G_\mu$ in $G_\nu$. Since these 
four edges must necessarily have four different orientations, their states are linked, and it does not matter 
how we distribute the weight over them; we can simply set the weight of three of these edges to 0 and the 
remaining one to the solution of the appropriately oriented subproblem. The weights of the remaining edges are 
derived from $\mathcal{L}$ as before, and again we fix the orientations of the external edges of $G_\nu$ in four 
different ways and apply Theorem 2 to each of them. We again store the resulting four optimal values and the 
corresponding layouts at $\nu$, in which we insert the correctly oriented subsolutions for all children of $\nu$. 

This whole process takes $O(k^4 \log k)$ time in the worst case. Finally, since weights are expressed as 
differences with respect to the minimal layout, we compute the value of $\mathcal{L}_{\min}$ and add it as an 
offset to the computed optimal solution to get the actual value of the globally optimal solution. This takes 
$O(n)$ time. 
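
The bottom-up traversal can be outlined as a short recursion. This is only a structural sketch: `optimize` is a hypothetical stand-in for the solver of Theorem 2, the orientation labels 0 to 3 are arbitrary, and the toy solver below does not model how a child's orientation is tied to the edges around its artificial vertex.

```python
ORIENTATIONS = (0, 1, 2, 3)  # four ways to orient the external edges

def solve_tree(node, children, optimize):
    """Return {orientation: best value} for the subtree rooted at node.

    children: dict node -> list of child nodes in the decomposition tree
    optimize: hypothetical stand-in for the Theorem 2 solver, called as
              optimize(node, orientation, child_tables)
    """
    # Children first, so their four stored solutions are available.
    child_tables = {child: solve_tree(child, children, optimize)
                    for child in children.get(node, [])}
    # One optimization per fixed orientation of the external edges.
    return {o: optimize(node, o, child_tables) for o in ORIENTATIONS}

# Toy stand-in: a fixed base value per orientation plus the best stored
# value of every child. A real solver would instead use the child value
# matching the orientation of the edges around the artificial vertex.
BASE = {"root": (1, 2, 3, 4), "left": (5, 0, 0, 0), "right": (0, 7, 0, 0)}

def toy_optimize(node, orientation, child_tables):
    return BASE[node][orientation] + sum(
        max(table.values()) for table in child_tables.values())

children = {"root": ["left", "right"]}
table = solve_tree("root", children, toy_optimize)
```

Every node of the tree triggers exactly four calls to the solver, which is the source of the constant factor 4 hidden in the overall running time.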

Theorem 3. Let $T$ be a 2-level treemap, such that the extended dual graph $G$ of the global layout has 
no separating 3-cycles. Let $n$ be the number of cells in the bottom level and $k$ the number of regions in 
the top level. Then we can find a rectangular subdivision that preserves all oriented internal bottom-level 
adjacencies, and preserves as many oriented external bottom-level adjacencies as possible in 
$O(k^4 \log k + n)$ time. 



4 Without preserving orientations 

In this section we study the variant of the problem where we do not need to preserve the orientations of the 
adjacencies that we preserve. We still assume that the required global layout of the output treemap is given 
in advance. 

We first define an adjacency graph on the boundary cells of all rectangle complexes. There is a vertex in this 
graph for each cell that belongs to the boundary of a rectangle complex. Since the global layout is given, we 
know where the four corners separating the boundary sides are. There is an internal adjacency edge between 
any two vertices of the same rectangle complex whose cells are adjacent in the complex. There are external 
adjacency edges between vertices of cells of different rectangle complexes if the cells are adjacent in the 
input. 

Next we note that the adjacency graph as a subgraph of the dual graph for the input layout is a planar graph. 
Our goal is to select subsets of the vertices to be on the boundary of each rectangle complex whose induced 
subgraph has as many external adjacency edges as possible but also would not create any separating triangles 
if we imagine connecting all the external adjacencies corresponding to one boundary to one vertex. 

For the remainder we restrict ourselves to the case of two top-level regions, and only at the very end extend 
the arguments to more regions. In our subgraph of the adjacency graph we remove those internal adjacency 
edges that correspond to two directly neighboring cells when traversing the boundary of the corresponding 
rectangle complex. In the remaining graph, we need to find an independent set in terms of the remaining 
internal adjacency edges, since any two adjacent vertices in this graph would induce a separating triangle. 
We first prove that preserving as many bottom-level external adjacencies as possible is NP-hard already for 
the case of two top-level regions. 

Fig. 10. Equality gadget enforcing that x and z are equivalent. 



4.1 NP-hardness 

In positive 1-in-3-SAT each clause contains exactly three (non-negated) variables, and we need to decide 
whether there is a truth assignment such that exactly one variable per clause is true. As input we are given 
the collection of clauses together with a planar embedding of the associated graph such that all variables 
are on a straight line and no edge crosses the straight line. This problem was shown to be NP-complete by 
Mulzer and Rote [12]. We reduce from the following variant of positive 1-in-3-SAT. 

Lemma 9. Planar positive 1-in-3-SAT with variables on a line and with every variable occurring in at least 
one clause on each side of the line and in at most three clauses is NP-hard. 



Proof. Consider the equality gadget in Figure 10. It enforces that $x$ and $z$ are equivalent. The equality 
gadget consists of the same clauses as the corresponding gadget in [12], but it arranges them such that any 
variable other than $x$ and $z$ occurs in a clause on each side of the line. Concatenating equality gadgets 
gives us a sequence of equivalent variables $x_0, \ldots, x_{k+1}$, where $x_0$ and $x_{k+1}$ occur in exactly 
one clause and the other variables occur in exactly two clauses, one clause on each side of the line. Given an 
instance of planar positive 1-in-3-SAT, we replace any variable that occurs in $k > 1$ clauses by such a 
sequence, and use $x_1, \ldots, x_k$ to connect to one additional clause each. We connect any remaining 
variable $x$ that only occurs in one clause (like $x_0$ and $x_{k+1}$) to two new identical clauses 
$(x \vee a \vee b)$ and $(x \vee a \vee b)$ with additional variables $a$ and $b$, one on each side of the 
line. □ 

Fig. 11. A set of rectangles representing a variable. Left: 1 occurrence above the line, 2 below; middle: 2 above, 1 below; 
right: 1 and 1. 



In the reduction we have two top-level regions with a staircase boundary between them. For each variable 
we have a variable gadget consisting of a set of rectangles on both sides of the boundary. Figure 11 shows 
the three variable gadgets used. Which gadget is used depends on the number of occurrences in clauses. 
From the V- and N-rectangles we can on each side of the boundary only keep an independent set (in terms 
of the adjacency graph) since any adjacent pair would induce a separating triangle due to the A-rectangles. 
A V-rectangle extends beyond its vertical and/or its horizontal dotted line; we call such a V-rectangle 
connecting. O-rectangles are only placed opposite to those V-rectangles that are extended in this way. The 
complete boundary is filled with variable gadgets. 

A clause is represented by three adjacencies: for a clause on the left of the boundary, from the left variable 
gadget a V-rectangle extends upwards and from the right variable gadget a V-rectangle extends to the left, so 
that the two rectangles touch. From the middle variable gadget a V-rectangle extends to the left and upwards, 
so that it touches the two other rectangles. The case to the right of the boundary is analogous. We do this for 
every clause. Finally, any empty spaces are filled with rectangles greedily. 

For the moment assume that for a variable either all V-rectangles (and no N-rectangles) are kept on the 
boundary (this corresponds to setting a variable to true) or all N-rectangles (and no V-rectangles) are kept 
(this corresponds to setting a variable to false). Additionally we keep as many A- and O-rectangles as 
possible. In the case that we keep all V-rectangles we can achieve 10 or 11 adjacencies depending on whether 
the variable occurs two or three times. Thus, we achieve 8 adjacencies plus the number of occurrences of the 
variable in clauses. In the case that we keep all N-rectangles we can achieve 8 adjacencies. Additionally, we 
get 1 adjacency for each pair of neighboring variable gadgets. For the three variable gadgets involved in the 
same clause gadget, we can for at most one keep all V-rectangles since the clause gadget would otherwise 
yield a separating triangle. Thus if we have $m$ variables and $n$ clauses, the total number of adjacencies is 
$8m + m - 1$ plus the number of occurrences of variables that have been set to true. Now if the original 
formula has a satisfying assignment with exactly one variable true per clause then the number of adjacencies 
we can achieve in this way is $9m - 1 + n$, and if there is no such assignment the number of adjacencies is 
smaller. 
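
The count can be written out explicitly. In the following sketch, $A$ (the total number of preserved adjacencies) and $\mathrm{occ}(x)$ (the number of clauses containing the variable $x$) are notation introduced here for illustration only.

```latex
\begin{align*}
A &= \underbrace{8m}_{\text{variable gadgets}}
   + \underbrace{(m-1)}_{\text{neighboring gadgets}}
   + \sum_{x \text{ true}} \mathrm{occ}(x) \\
  &\le 9m - 1 + n .
\end{align*}
```

The inequality holds because each clause gadget admits at most one true variable, so the sum counts every clause at most once. Equality requires every clause to contain exactly one true variable, that is, a satisfying 1-in-3 assignment.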

It remains to show that we can indeed assume that for a variable gadget we have only V-rectangles or only 
N-rectangles. First observe that as long as we keep all connecting V-rectangles on the boundary, there is 
no advantage in dropping any of the remaining ones. It remains to prove that we do not get more than 8 
adjacencies if we do not choose to keep all connecting V-rectangles on the boundary; this implies that such 
a configuration has no advantage over the one with all N-rectangles, thus we can replace it by the latter. 

We refer to external adjacencies by the symbols of the rectangles, e.g., we call an adjacency between a V- 
and an A-rectangle a V-A adjacency. We now go through all cases. If we keep the N-rectangles on one side 
and the V-rectangles on the other side then we get no V-V or N-N adjacencies, 3 V-A, 3 N-V and at most 2 
V-O adjacencies, thus at most 8. There is one configuration in which we keep a pair of externally-adjacent 
V-rectangles and a pair of externally-adjacent N-rectangles. In this case we have 1 V-V, 1 N-N, 2 V-A, 
2 N-A, and at most 2 V-O adjacencies, thus at most 7. If we keep 3 V-rectangles, but not all connecting 
ones, then we lose at least 3 adjacencies, so we keep at most 8. All other cases give subsets of the cases 
above, and leave us with even fewer (at most 7) adjacencies. Thus, the assumption was valid. From this the 
following theorem follows. 

Theorem 4. Given an input subdivision, finding a rectangular subdivision that respects the global layout, 
preserves all internal adjacencies, and preserves as many bottom-level external adjacencies as possible is 
NP-hard. 



4.2 Upper bound 

We now show that it is sometimes not possible to preserve more than a factor 1/4 of the external adjacencies. 

We will construct an example with $4h + 7$ boundary rectangles (and therefore $4h + 6$ external adjacencies) 
between two regions. To be more precise, we will construct a graph which we show is the dual graph of a pair 
of rectangle complexes. Figure 12 illustrates the construction. The graph consists of two pieces (one for each 
rectangle complex). The top piece has 7 vertices, 3 of which are isolated and 4 of which are connected by a 
maximal outerplanar graph. The bottom piece consists of $4h$ vertices, in four groups of $h$. The first and third 
groups of vertices are all isolated, while the vertices of the second and fourth groups are pairwise connected. 

Fig. 12. Illustration of the construction showing that we cannot keep more than a quarter of the external 
adjacencies. Black edges indicate internal adjacencies; orange and blue edges are external adjacencies. Blue 
edges are incident to an isolated vertex in the top graph. 

Finally, we add a complete planar bipartite graph between the two groups of vertices by connecting each 
group of h vertices to one of the 4 connected vertices in the top graph, and filling the gaps with 6 additional 
edges. 

Claim. We cannot preserve more than $h + 6$ edges in this construction. 

Proof. Recall that we need to take an independent set in both the top and bottom outerplanar graph. This 
implies that in the top graph, if we keep the first or the fourth vertex, we cannot keep any of the other three. 
Therefore, we can either keep the complete first or the third group of $h$ edges, or a combination of edges in 
the second and fourth group together. However, each edge in the second group is connected to an edge in the 
fourth group via an internal adjacency in the bottom graph, so we can keep at most half of the edges in these 
groups together: at most $h$. Since there are only 6 edges not in a group, we can keep at most $h + 6$ edges. 

□ 

If desired, we can create another copy of the construction and place it upside down to have a non-constant 
number of vertices at the top as well as at the bottom. 

Theorem 5. Given an input subdivision, there generally exists no rectangular subdivision that respects the 
global layout, preserves all internal adjacencies, and preserves more than 1/4th of the bottom-level external 
adjacencies. 
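
The factor 1/4 follows from the construction by comparing the at most $h + 6$ preserved edges with the $4h + 6$ external adjacencies in total (four groups of $h$ edges plus 6 additional edges). The following short calculation spells this out:

```latex
\[
\frac{h+6}{4h+6} \;=\; \frac{1}{4} + \frac{9}{8h+12}
\;\xrightarrow[h\to\infty]{}\; \frac{1}{4}.
\]
```

For example, $h = 100$ gives $106/406 \approx 0.261$, so the preserved fraction gets arbitrarily close to 1/4 as $h$ grows.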

4.3 Algorithm 

We first describe an algorithm for two top-level regions. The connected components of internal adjacency 
edges form outerplanar graphs on which we can solve the maximum weight independent set problem in 
linear time exactly (where the weight of a vertex is its degree in terms of external adjacency edges). We first 
solve the maximum weight independent set problem for one of the sides of the boundary exactly and then 
solve it for the other side using only the adjacencies with rectangles from the maximum weight independent 
set. 
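
The two-phase procedure can be sketched on a small instance. This is a minimal illustration under stated assumptions: the independent-set routine below is brute force and merely stands in for the linear-time algorithm on outerplanar graphs, and the function names and the sample instance are made up.

```python
from itertools import combinations

def max_weight_independent_set(vertices, internal_edges, weight):
    """Brute-force MWIS; a stand-in for the linear-time outerplanar algorithm."""
    best, best_w = frozenset(), 0
    for r in range(1, len(vertices) + 1):
        for subset in combinations(vertices, r):
            s = frozenset(subset)
            if any(u in s and v in s for u, v in internal_edges):
                continue   # two kept cells would induce a separating triangle
            w = sum(weight[v] for v in s)
            if w > best_w:
                best, best_w = s, w
    return best, best_w

def two_phase(side_a, side_b, internal_a, internal_b, external):
    """external: list of (a, b) adjacencies across the boundary."""
    # Phase 1: weight = external degree, solved exactly on the first side.
    deg_a = {a: sum(1 for x, _ in external if x == a) for a in side_a}
    keep_a, upper_bound = max_weight_independent_set(side_a, internal_a, deg_a)
    # Phase 2: only adjacencies to the rectangles kept in phase 1 count.
    deg_b = {b: sum(1 for x, y in external if y == b and x in keep_a)
             for b in side_b}
    keep_b, preserved = max_weight_independent_set(side_b, internal_b, deg_b)
    return keep_a, keep_b, preserved, upper_bound

# Made-up instance with three boundary cells per side.
side_a, side_b = ["a1", "a2", "a3"], ["b1", "b2", "b3"]
internal_a, internal_b = [("a1", "a3")], [("b1", "b3")]
external = [("a1", "b1"), ("a1", "b2"), ("a2", "b2"), ("a3", "b3"), ("a3", "b2")]
keep_a, keep_b, preserved, upper_bound = two_phase(
    side_a, side_b, internal_a, internal_b, external)
```

Here the first phase keeps `a1` and `a2` (weight 3), and the second phase keeps `b1` and `b2`, preserving 3 of the 5 external adjacencies.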

By three-coloring the outerplanar graph we see that the weight of the independent set is at least a third of the 
total weight since the maximum weight of a color class is a lower bound for the weight of the independent 
set. Applying the same argument to the other outerplanar graph shows that we preserve at least a third of 
the remaining adjacencies. Thus overall we keep at least 1/9th of the external adjacencies. Furthermore, the 
weight of the independent set for the first side is an upper bound on the number of external adjacencies that 
we can preserve. Since we preserve a third of this weight, our algorithm is a 1/3-approximation. 
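
The two bounds can be summarized in one chain of inequalities. The symbols are notation introduced here for illustration: $W$ is the total number of external adjacencies, $I_1$ the independent set computed for the first side with weight $w(I_1)$, $A$ the number of adjacencies preserved by the algorithm, and $\mathrm{OPT}$ the optimum.

```latex
\begin{align*}
w(I_1) &\ge \tfrac{1}{3}\, W, \\
A &\ge \tfrac{1}{3}\, w(I_1) \ge \tfrac{1}{9}\, W, \\
\mathrm{OPT} &\le w(I_1) \;\Longrightarrow\; A \ge \tfrac{1}{3}\,\mathrm{OPT}.
\end{align*}
```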

For more than two top-level regions the choices of which rectangles to place on a boundary are not 
independent. Instead of solving the problem for the boundaries between two regions we solve it for the line 
segments of the global layout (with possibly more than one region on each of the sides of the line segment). 
The choices of which rectangles to keep adjacent to line segments are also not independent, but if we only 
consider horizontal or only vertical line segments they are. By optimizing only for horizontal or only for 
vertical line segments, we again lose at most a factor of 1/2 in terms of the number of adjacencies preserved. 
Thus, overall we can preserve at least 1/18th of the external adjacencies and obtain a 1/6-approximation. 
We can compute a corresponding rectangular subdivision in linear time [8]. 

Theorem 6. Given an input subdivision, we can find a rectangular subdivision that respects the global 
layout, preserves all internal adjacencies, and preserves at least 1/18th of the bottom-level external 
adjacencies in $O(n)$ time. 

Corollary 1. Given an input subdivision, let $s$ be the maximal number of bottom-level external adjacencies 
preserved in any rectangular subdivision that respects the global layout and preserves all internal adjacencies. 
We can find a rectangular subdivision that respects the global layout, preserves all internal adjacencies, and 
preserves at least $s/6$ bottom-level external adjacencies in $O(n)$ time. 